Idempotent Expansions for Continuous-Time Stochastic Control
Authors
Abstract
It is now well known that many classes of deterministic control problems may be solved by max-plus or min-plus (more generally, idempotent) numerical methods. These methods include max-plus basis-expansion approaches [1], [2], [6], [9], as well as the more recently developed curse-of-dimensionality-free methods [9], [14]. It has recently been discovered that idempotent methods are also applicable to stochastic control and games. These methods are related to the curse-of-dimensionality-free methods for deterministic control. In particular, a min-plus based method was found for stochastic control problems [10], [15], and a min-max method was discovered for games [11]. The first such methods for stochastic control were developed only for discrete-time problems. The key tools enabling their development were the idempotent distributive property and the fact that certain solution forms are retained through application of the semigroup operator (i.e., the dynamic programming principle operator). In particular, under certain conditions, pointwise minima of affine and quadratic forms pass through this operator. Because the operator contains an expectation component, this requires application of the idempotent distributive property. In the case of finite sums and products, this property looks like the standard-algebra distributive property; in the infinitesimal case, it is familiar to control theorists through notions of strategies, non-anticipative mappings and/or progressively measurable controls. Using this technology, the value function can be propagated backward with a representation as a pointwise minimum of quadratic or affine forms. Here, we remove the severe restriction to discrete-time problems. This extension requires overcoming significant technical hurdles. First, note that as these methods are related to the max-plus curse-of-dimensionality-free methods of deterministic control, there will be a discretization over time, but not over space.
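As a small illustration of the finite case of the distributive property, the following sketch checks numerically that an expectation of pointwise minima equals a minimum, over all selection maps, of expectations. It is not taken from the paper: the function names, the probabilities and the random cost table are all hypothetical and chosen only for illustration.

```python
import itertools
import random


def expectation_of_minima(probs, costs):
    """Left-hand side: sum_i p_i * min_j costs[i][j]."""
    return sum(p * min(row) for p, row in zip(probs, costs))


def minimum_over_selections(probs, costs):
    """Right-hand side: min over all maps i -> j(i) of sum_i p_i * costs[i][j(i)]."""
    n_choices = len(costs[0])
    best = float("inf")
    # Enumerate every selection map; there are n_choices ** len(costs) of them.
    for selection in itertools.product(range(n_choices), repeat=len(costs)):
        total = sum(p * row[j] for p, row, j in zip(probs, costs, selection))
        best = min(best, total)
    return best


# Hypothetical data: three noise outcomes, four candidate forms per outcome.
random.seed(0)
probs = [0.2, 0.5, 0.3]
costs = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in probs]

lhs = expectation_of_minima(probs, costs)
rhs = minimum_over_selections(probs, costs)
```

The right-hand side ranges over 4^3 = 64 selection maps for only three outcomes, which hints at why the number of terms grows so quickly when the property is applied repeatedly.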
We will first define a parameterized set of operators approximating the dynamic programming operator. We obtain the solutions to the problem of backward propagation by repeated application of the approximating operators. These solutions are parameterized by the time-discretization step size. Using techniques from the theory of viscosity solutions, we show that the solutions converge to the viscosity solution of the Hamilton-Jacobi-Bellman partial differential equation (HJB PDE) associated with the original problem. The problem is now reduced to backward propagation by these approximating operators. The min-plus distributive property is employed. A generalization of this distributive property, applicable to continuum versions, will be obtained. This will allow interchange of expectation over normal random variables (and other random variables with range in ℝ) with infimum operators. At each time step, the solution will be represented as an infimum over a set of quadratic forms. Use of the min-plus distributive property will allow us to maintain that solution form as one propagates backward in time. Backward propagation is reduced to simple standard-sense linear algebraic operations on the coefficients in the representation. We also demonstrate that the assumptions on the representation which allow one to propagate backward one step are inherited by the representation at the next step. The difficulty with the approach is an extreme curse-of-complexity, wherein the number of terms in the min-plus expansion grows very rapidly as one propagates. The complexity growth will be attenuated via projection onto a lower-dimensional min-plus subspace at each time step. At each step, one desires to project onto the optimal subspace relative to the solution approximation; that is, the subspace is not set a priori. In the discrete-time case, it has been demonstrated that for some problem classes, this approach is substantially superior to grid-based methods.
Simple numerical examples with continuous-time dynamics will be examined with this new approach.
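The backward-propagation idea described in the abstract can be sketched on a toy scalar problem. This is a rough illustration under strong assumptions, not the paper's algorithm: the Gaussian noise is replaced by a two-point variable taking values -1 and +1 with probability 1/2 each (so only the finite distributive property is needed), the dynamics are a hypothetical Euler-type step x -> (1 + h*drift)*x + sqrt(h)*sigma*w with a finite set of control modes, and the projection onto an optimal min-plus subspace is replaced by a much cruder pruning at sample points. All names and numbers are invented for the example.

```python
import itertools
import math


def evaluate(quads, x):
    """Pointwise minimum of the quadratics P*x^2 + 2*Q*x + R at x."""
    return min(P * x * x + 2 * Q * x + R for P, Q, R in quads)


def shift(quad, a, c):
    """Coefficients of x -> g(a*x + c), where g(y) = P*y^2 + 2*Q*y + R."""
    P, Q, R = quad
    return (P * a * a, a * (P * c + Q), P * c * c + 2 * Q * c + R)


def backward_step(quads, modes, h):
    """One step: min over modes of h*cost*x^2 + E[V(a*x + c*w)], w = -1 or +1."""
    new_quads = []
    for drift, sigma, cost in modes:
        a = 1.0 + h * drift
        c = math.sqrt(h) * sigma
        # Distributive property: E[min_z g_z] = min over pairs (z_minus, z_plus).
        for g_minus, g_plus in itertools.product(quads, repeat=2):
            Pm, Qm, Rm = shift(g_minus, a, -c)
            Pp, Qp, Rp = shift(g_plus, a, c)
            new_quads.append((h * cost + 0.5 * (Pm + Pp),
                              0.5 * (Qm + Qp),
                              0.5 * (Rm + Rp)))
    return new_quads


def prune(quads, sample_points):
    """Crude stand-in for complexity attenuation: keep only the quadratics
    that attain the minimum at one of the sample points."""
    keep = set()
    for x in sample_points:
        values = [P * x * x + 2 * Q * x + R for P, Q, R in quads]
        keep.add(values.index(min(values)))
    return [quads[i] for i in sorted(keep)]


# Hypothetical terminal cost (two quadratics) and two control modes,
# each mode given as (drift, sigma, running-cost weight).
terminal = [(1.0, 0.0, 0.0), (0.5, -0.5, 1.0)]
modes = [(-1.0, 0.5, 1.0), (0.5, 0.2, 0.1)]
h = 0.1

stepped = backward_step(terminal, modes, h)   # 2 modes * 2^2 pairs = 8 terms
samples = [i / 4.0 for i in range(-8, 9)]
pruned = prune(stepped, samples)
```

Each step only manipulates the coefficients (P, Q, R), which mirrors the reduction to ordinary linear-algebraic operations; without pruning, a second step would already produce 2 * 8^2 = 128 terms.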
Similar resources
Distributed dynamic programming for discrete-time stochastic control, and idempotent algorithms
Previously, idempotent methods have been found to be extremely fast for solution of dynamic programming equations associated with deterministic control problems. The original methods exploited the idempotent (e.g., max-plus) linearity of the associated semigroup operator. However, it is now known that the curse-of-dimensionality-free idempotent methods do not require this linearity. Instead, it...
Idempotent Method for Continuous-Time Stochastic Control and Complexity Attenuation
We consider a min-plus based numerical method for solution of finite time-horizon control of nonlinear diffusion processes. The approach belongs to the class of curse-of-dimensionality-free methods. The min-plus distributive property is required. The price to pay is a very heavy curse-of-complexity. These methods perform well due to the complexity-attenuation step. This projects the solution down ...
Optimal Stochastic Control in Continuous Time with Wiener Processes: General Results and Applications to Optimal Wildlife Management
We present a stochastic optimal control approach to wildlife management. The objective value is the present value of hunting and meat, reduced by the present value of the costs of plant damages and traffic accidents caused by the wildlife population. First, general optimal control functions and value functions are derived. Then, numerically specified optimal control functions and value func...
Evaluation of the Lyapunov Exponent for Stochastic Dynamical Systems with Event Synchronization
We consider stochastic dynamical systems operating under synchronization constraints on system events. The system dynamics is represented by a linear vector equation in an idempotent semiring through second-order state transition matrices with both random and constant entries. As the performance measure of interest, the Lyapunov exponent defined as the asymptotic mean growth rate of the system ...
Market Adaptive Control Function Optimization in Continuous Cover Forest Management
Economically optimal management of a continuous cover forest is considered here. Initially, there is a large number of trees of different sizes and the forest may contain several species. We want to optimize the harvest decisions over time, using continuous cover forestry, which is denoted by CCF. We maximize our objective function, the expected present value, with consideration of stochastic p...
Journal title:
- SIAM J. Control and Optimization
Volume 54, Issue
Pages -
Publication date: 2016